{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/node_postprocessor/CohereRerank.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Cohere Rerank"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.3.2\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.0\u001b[0m\n",
      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
      "Note: you may need to restart the kernel to use updated packages.\n"
     ]
    }
   ],
   "source": [
    "%pip install llama-index > /dev/null\n",
    "%pip install llama-index-postprocessor-cohere-rerank > /dev/null"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
    "from llama_index.core.response.pprint_utils import pprint_response"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Download Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2024-05-09 17:56:26--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt\n",
      "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8003::154, 2606:50c0:8000::154, 2606:50c0:8002::154, ...\n",
      "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8003::154|:443... connected.\n",
      "HTTP request sent, awaiting response... 200 OK\n",
      "Length: 75042 (73K) [text/plain]\n",
      "Saving to: ‘data/paul_graham/paul_graham_essay.txt’\n",
      "\n",
      "data/paul_graham/pa 100%[===================>]  73.28K  --.-KB/s    in 0.009s  \n",
      "\n",
      "2024-05-09 17:56:26 (7.81 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "!mkdir -p 'data/paul_graham/'\n",
    "!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# load documents\n",
    "documents = SimpleDirectoryReader(\"./data/paul_graham/\").load_data()\n",
    "\n",
    "# build index\n",
    "index = VectorStoreIndex.from_documents(documents=documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Retrieve top 10 most relevant nodes, then filter with Cohere Rerank"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from llama_index.postprocessor.cohere_rerank import CohereRerank\n",
    "\n",
    "\n",
    "api_key = os.environ[\"COHERE_API_KEY\"]\n",
    "cohere_rerank = CohereRerank(api_key=api_key, top_n=2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine(\n",
    "    similarity_top_k=10,\n",
    "    node_postprocessors=[cohere_rerank],\n",
    ")\n",
    "response = query_engine.query(\n",
    "    \"What did Sam Altman do in this essay?\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Final Response: Sam Altman was asked if he wanted to be the president\n",
      "of Y Combinator. Initially, he declined as he wanted to start a\n",
      "startup focused on making nuclear reactors. However, after persistent\n",
      "persuasion, he eventually agreed to become the president of Y\n",
      "Combinator starting with the winter 2014 batch.\n",
      "______________________________________________________________________\n",
      "Source Node 1/2\n",
      "Node ID: 7ecf4eb2-215d-45e4-ba08-44d9219c7fa6\n",
      "Similarity: 0.93033177\n",
      "Text: When I was dealing with some urgent problem during YC, there was\n",
      "about a 60% chance it had to do with HN, and a 40% chance it had do\n",
      "with everything else combined. [17]  As well as HN, I wrote all of\n",
      "YC's internal software in Arc. But while I continued to work a good\n",
      "deal in Arc, I gradually stopped working on Arc, partly because I\n",
      "didn't have t...\n",
      "______________________________________________________________________\n",
      "Source Node 2/2\n",
      "Node ID: 88be17e9-e0a0-49e1-9ff8-f2b7aa7493ed\n",
      "Similarity: 0.86269903\n",
      "Text: Up till that point YC had been controlled by the original LLC we\n",
      "four had started. But we wanted YC to last for a long time, and to do\n",
      "that it couldn't be controlled by the founders. So if Sam said yes,\n",
      "we'd let him reorganize YC. Robert and I would retire, and Jessica and\n",
      "Trevor would become ordinary partners.  When we asked Sam if he wanted\n",
      "to...\n"
     ]
    }
   ],
   "source": [
    "pprint_response(response, show_source=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Directly retrieve top 2 most similar nodes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine(\n",
    "    similarity_top_k=2,\n",
    ")\n",
    "response = query_engine.query(\n",
    "    \"What did Sam Altman do in this essay?\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Retrieved context is irrelevant and response is hallucinated."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Final Response: Sam Altman was asked to become the president of Y\n",
      "Combinator, initially declined the offer to pursue starting a startup\n",
      "focused on nuclear reactors, but eventually agreed to take over\n",
      "starting with the winter 2014 batch.\n",
      "______________________________________________________________________\n",
      "Source Node 1/2\n",
      "Node ID: 7ecf4eb2-215d-45e4-ba08-44d9219c7fa6\n",
      "Similarity: 0.8308840369082053\n",
      "Text: When I was dealing with some urgent problem during YC, there was\n",
      "about a 60% chance it had to do with HN, and a 40% chance it had do\n",
      "with everything else combined. [17]  As well as HN, I wrote all of\n",
      "YC's internal software in Arc. But while I continued to work a good\n",
      "deal in Arc, I gradually stopped working on Arc, partly because I\n",
      "didn't have t...\n",
      "______________________________________________________________________\n",
      "Source Node 2/2\n",
      "Node ID: 88be17e9-e0a0-49e1-9ff8-f2b7aa7493ed\n",
      "Similarity: 0.8230144027954406\n",
      "Text: Up till that point YC had been controlled by the original LLC we\n",
      "four had started. But we wanted YC to last for a long time, and to do\n",
      "that it couldn't be controlled by the founders. So if Sam said yes,\n",
      "we'd let him reorganize YC. Robert and I would retire, and Jessica and\n",
      "Trevor would become ordinary partners.  When we asked Sam if he wanted\n",
      "to...\n"
     ]
    }
   ],
   "source": [
    "pprint_response(response, show_source=True)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
